Semi-supervised graph partitioning with decision trees.
Authors
Abstract
In this paper we investigate a new framework for graph partitioning that uses decision trees to search for sub-graphs within a graph adjacency matrix. Graph partitioning by a decision tree seeks to optimize a specified graph partitioning index, such as ratio cut, by recursively applying decision rules found within nodes of the graph. Key advantages of tree models for graph partitioning are that they provide a predictive framework for evaluating the quality of the solution, determining the number of sub-graphs, and assessing overall variable importance. We evaluate the performance of tree-based graph partitioning on a benchmark dataset for multiclass classification of tumor diagnosis based on gene expression. Three graph cut indices (ratio cut, normalized cut, and network modularity) are compared and assessed in terms of their classification accuracy, their power to estimate the optimal number of sub-graphs, and their ability to extract known important variables within the dataset.
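The three indices named in the abstract can be illustrated with a minimal sketch. It assumes a symmetric, non-negative adjacency matrix and commonly used textbook definitions: ratio cut and normalized cut summed over groups without a 1/2 factor, and Newman's modularity. The paper's exact definitions may differ, and the function name `partition_indices` is hypothetical, not taken from the paper.

```python
import numpy as np


def partition_indices(adj, labels):
    """Return (ratio cut, normalized cut, modularity) for one labelling.

    Assumptions (not from the paper): `adj` is a symmetric, non-negative
    adjacency matrix without self-loops, and every group has positive volume.
    """
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    degree = adj.sum(axis=1)   # weighted degree of each node
    two_m = degree.sum()       # twice the total edge weight

    ratio_cut = 0.0
    normalized_cut = 0.0
    modularity = 0.0
    for group in np.unique(labels):
        inside = labels == group
        # Edge weight inside the group; each edge is counted twice.
        within = adj[np.ix_(inside, inside)].sum()
        volume = degree[inside].sum()   # sum of degrees in the group
        cut = volume - within           # weight leaving the group
        ratio_cut += cut / inside.sum()
        normalized_cut += cut / volume
        modularity += within / two_m - (volume / two_m) ** 2
    return ratio_cut, normalized_cut, modularity
```

Lower ratio cut and normalized cut indicate a better partition, while higher modularity does. For two triangles joined by a single edge and split along that edge, this sketch gives a ratio cut of 2/3, a normalized cut of 2/7, and a modularity of 5/14.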
Related resources
A Combinatorial View of Graph Laplacians
Discussions about different graph Laplacians—mainly the normalized and unnormalized versions of graph Laplacian—have been ardent with respect to various methods of clustering and graph based semi-supervised learning. Previous research in the graph Laplacians, from a continuous perspective, investigated the convergence properties of the Laplacian operators on Riemannian Manifolds. In this paper,...
Building Classifiers With Unrepresentative Training Instances: Experiences From The KDD Cup 2001 Competition
In this paper we discuss our experiences in participating in the KDD Cup 2001 competition. The task involved classifying organic molecules as either active or inactive in their binding to a receptor. The classification task presented three challenges: highly skewed class distribution, large number of features exceeding training set size by two orders of magnitude, and non-representative trainin...
A Combinatorial View of the Graph Laplacians
Discussions about different graph Laplacians, mainly normalized and unnormalized versions of the graph Laplacians, have been ardent with respect to various methods in clustering and graph based semi-supervised learning. Previous research on the graph Laplacians investigated their convergence properties to Laplacian operators on continuous manifolds. There is still no strong proof on convergence...
Transactions on Machine Learning and Data Mining Editorial
In this volume we have to discuss two papers: Both papers are very interesting, innovative and clearly written. These papers have something in common. This is the use of specific elements and the search for them. Such elements are the starting point for learning structures and replace simple but complex searches. Both papers also make use of graphs for the representation. Related problems have ...
Efficient Learning of Random Forest Classifier using Disjoint Partitioning Approach
Random Forest is an Ensemble Supervised Machine Learning technique. Research work in the area of Random Forest aims at either improving accuracy or improving performance. In this paper we are presenting our research towards improvement in learning time of Random Forest by proposing a new approach called Disjoint Partitioning. In this approach, we are using disjoint partitions of training datase...
Journal title:
- Genome informatics. International Conference on Genome Informatics
Volume 20
Publication date: 2008